Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor bundle init #2074

Merged
merged 38 commits into from
Jan 20, 2025
Merged

Refactor bundle init #2074

merged 38 commits into from
Jan 20, 2025

Conversation

shreyas-goenka
Copy link
Contributor

@shreyas-goenka shreyas-goenka commented Jan 3, 2025

Summary of changes

This PR introduces three new abstractions:

  1. Resolver: Resolves which reader and writer to use for a template.
  2. Writer: Writes a template project to disk. Prompts the user if necessary.
  3. Reader: Reads a template specification from disk, built into the CLI or from GitHub.

Introducing these abstractions helps decouple reading a template from writing it. When I tried adding telemetry for the bundle init command, I noticed that the code in cmd/init.go was getting convoluted and hard to test. A future change could have accidentally logged PII when a user initialised a custom template.

Hedging against that risk is important here because we use a generic untyped map<string, string> representation in the backend to log telemetry for the databricks bundle init. Otherwise, we risk accidentally breaking our compliance with our centralization requirements.

Details

After this PR there are two classes of templates that can be initialized:

  1. A databricks template: This could be a builtin template or a template outside the CLI like mlops-stacks, which is still owned and managed by Databricks. These templates log their telemetry arguments and template name.
  2. A custom template: These are templates created by and managed by the end user. In these templates we do not log the template name and args. Instead a generic placeholder string of "custom" is logged in our telemetry system.

NOTE: The functionality of the databricks bundle init command remains the same after this PR. Only the internal abstractions used are changed.

Tests

New unit tests. Existing golden and unit tests. Also a fair bit of manual testing.

Copy link

github-actions bot commented Jan 3, 2025

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/cli

Inputs:

  • PR number: 2074
  • Commit SHA: ed24f2880b101b8a671f681d9e87ea332aa33ebd

Checks will be approved automatically on success.

libs/template/reader.go Show resolved Hide resolved
libs/template/reader.go Outdated Show resolved Hide resolved
libs/template/template.go Outdated Show resolved Hide resolved
Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please TAL at the remaining open comments before merging.

libs/template/reader.go Outdated Show resolved Hide resolved
libs/template/reader.go Show resolved Hide resolved
libs/template/reader_test.go Outdated Show resolved Hide resolved
libs/template/reader_test.go Outdated Show resolved Hide resolved
@shreyas-goenka shreyas-goenka added this pull request to the merge queue Jan 20, 2025
Merged via the queue into main with commit 41a21af Jan 20, 2025
9 checks passed
@shreyas-goenka shreyas-goenka deleted the refactor-bundle-init-squashed branch January 20, 2025 12:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants